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Abstract 

We revisit the classical population genetics model of a population evolving under multi- 
plicative selection, mutation and drift. The number of beneficial alleles in a multi-locus 
system can be considered a trait under exponential selection. Equations of motion are de- 
rived for the cumulants of the trait distribution in the diffusion limit and under the assump- 
tion of hnkage equilibrium. Because of the additive nature of cumulants, this reduces to the 
problem of determining equations of motion for the expected allele distribution cumulants 
at each locus. The cumulant equations form an infinite dimensional linear system and in 
an authored appendix Adam Priigel-Bennett provides a closed form expression for these 
equations. We derive approximate solutions which are shown to describe the dynamics 
well for a broad range of parameters. In particular, we introduce two approximate analyti- 
cal solutions: (1) Perturbation theory is used to solve the dynamics for weak selection and 
arbitrary mutation rate. The resulting expansion for the system's eigenvalues reduces to 
the known diffusion theory results for the limiting cases with either mutation or selection 
absent. (2) For low mutation rates we observe a separation of time-scales between the 
slowest mode and the rest which allows us to develop an approximate analytical solution 
for the dominant slow mode. The solution is consistent with the perturbation theory result 
and provides a good approximation for much stronger selection intensities. 
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I. INTRODUCTION 



When modelling the dynamics of a finite population it is necessary to consider the effects of 
fluctuations introduced by genetic drift. The most influential theoretical approach to this prob- 
lem has been to formulate a model of the allele frequency dynamics which incorporates drift and 
selection at the genie level (for models of molecular evolution we might be considering individ- 
ual nucleotides but we will adopt gene nomenclature and talk of the alternative alleles at each 
locus). The resulting model can be analysed using the theory of Markov chains or more usually 
by invoking a diffusion approximation (see, for example, |Crow and Kimura, 197(J| ; |Hwens, 1979| ). 
Alternatively, a number of theoretical models of selection have been developed which describe 
evolution at the phenotype level. These models typically describe the population by a small num- 
ber of descriptive statistics, such as moments or cumulants of a trait distribution. Such models 



are u seful when describing selection on quantitative traits (jBulmer, 1980|; [Biirger, 1991|; Dawson, 



I , , I 

1999; [Turelli and Barton, 1994D and have also been used to develop theories of asexual evolution 



in finite populations ( |Higgs and Woodcock, 199^ ; priigel-Bennett, 1997| ). [Burger (1991D provided 



one of the first analyses of polygenic dynamics using cumulants and in a recent book he provides a 

detailed account of the cumulant dynamics associated with various evolutionary models (Biirger, 

I I 

2000). Cumulants have also been used for modelling the population dynamics of genetic algo- 
rithms, where an accurate characterisation of finite population effects is crucial (see, for example, 
Priigel-Bennett and Shapiro, 1994| ; piattray and Shapiro, 1996| ). Cumulants have a number of 



features which make them attractive for modelling polygenic dynamics. Cumulants of increasing 
order will often have decreasing impact, a feature not shared by the distribution moments. Another 
important feature is that cumulants are additive, in the sense that the cumulants of a sum of random 
variables is equal to the sum of the cumulants of each random variable. Standard cumulants are 
most appropriate when describing deviations from a Gaussian trait distribution ( Stuart and Ord, 
1987) while factorial cumulants can be useful for distributions close to Poisson ( pawson, 19^9[ ). 



In this work we explore the relationship between the allele frequency dynamics and trait dis- 
tribution dynamics using a diffusion approximation. Dynamical equations for the trait distribution 
cumulants are derived for a finite population undergoing multiplicative selection and reversible 
mutation, assuming recombination maintains linkage equilibrium. Because of the additive nature 
of cumulants, this reduces to the problem of determining equations of motion for the expected 
allele distribution cumulants at each locus. This is a classical model in population genetics for 
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which the non-equilibrium diffusion theory is not solved for the general case, although solutions 
exist for the restricted cases when either mutation or selection is absent ( [Crow and Kimura, 19701 ). 
Analysis of the cumulant dynamics allows a new perspective as well as some new results. The 
cumulants form an infinite dimensional linear dynamical system and two approximate analytical 
solutions are found which accurately describe the dynamics for a broad range of parameter values. 
In the first case we use perturbation theory to solve the dynamics for weak selection and arbitrary 
mutation rate. The perturbation expansion is consistent with Kimura's eigenvalue expansion for 
the zero mutation case ( jKimura, 1955| ). Secondly, we show how a separation of time- scales be- 
tween the leading mode and the rest can be used to develop an approximation to the dynamics 
which holds for low mutation rates and a greater range of selection intensities. This separation of 
time-scales allows a remarkably simple and novel approximation to the transient dynamics under 
selection and weak mutation. The theory is compared to simulation results of finite populations 
undergoing free recombination, showing good agreement. 

II. EQUATIONS OF MOTION 

A. The model 

We consider a haploid population which is assumed to be at linkage equilibrium. The popula- 
tion has two alleles Xi G {1, 0} at each of L loci distributed according to the following factorised 
distribution, 

L 

(t^i.^) = n + (1 - Vi)5x,fl) , (1) 

i=l 

where (p^x) represents the probability mass function and 5^ is the Kronecker delta. Each pop- 
ulation member is labelled n = 1,2, . . . , N where N is the population size and we consider 
multiplicative selection with the fitness defined, 

Wr. = f[{l~Sy~^" . (2) 

i=l 

We also include symmetric mutation with rate u at each generation. 

B. Allele frequency dynamics 

If we take s ~ 0{N^^) and u ~ 0{N^^) then for large the allele frequency at each locus is 
subject to an independent diffusion process. We identify a = sN and (3 = uN to be the relevant 
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dimensionless quantities of order unity. The dynamics is completely determined by the mean 
and variance of the allele frequency change each generation and under the standard multinomial 
selection model we find (see, for example, pwens, 1979D , 

E{Api\pi) = a{pi)6t where a{pi) = api{l - Pi) + - 2pi) , 
Var(Api|pi) = h{pi)5t where h{pi) = pi{l - pi) , (3) 

to leading order in 5t, where 5t = N^^ defines an increment of time. The diffusion limit is 
obtained as 5t 0. The covariance between allele frequencies at different loci is 0{s^5t) and is 
therefore negligible in this limit, so we are justified in treating each locus independently as long 
as recombination is sufficient to maintain linkage equilibrium. In the diffusion framework the 
Kolmogorov forward equation (also known as the Fokker-Planck equation) describes the evolution 
of the allele frequency distribution over time, 

where t) is the probability density of pi at time t. This PDE is difficult to solve analytically 
and to our knowledge no analytical solution has been determined for the general case of selec- 
tion, mutation and drift. In this case numerical methods have to be used to study the dynamics 
of the allele frequency distribution. The solution at equilibrium is easier to obtain and was given 
by [Kimura (1955p confirming the result originally found by |Wright (1937p using a different ap- 



proach. For a good discussion of diffusion theory in the context of population genetics see, for 
example, |Crow and Kimura (1970| ). 

We have assumed that selection acts independently at each locus. This assumption can break 
down if L is sufficiently large, because interference between loci results in a reduced effective 
population size and the introduction of significant linkage disequilibrium. This effect can be quan- 
tified if the population is at linkage equilibrium before selection by calculating the fitness variance 
in a finite population. The effective population size can then be determined using the arguments 
developed by p^obertson (1961D . However, such interference will also generate strong linkage 



disequilibrium after selection which remains significant even under free recombination. We will 
therefore assume that L is sufficiently small to make such effects negligible. A necessary condi- 
tion for this to be the case is for = o{L^^) while sufficient conditions depend on the rate of 
recombination. 
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C. Cumulants at linkage equilibrium 



Under multiplicative selection we can consider the number of advantageous alleles per popula- 
tion member Z„ = J2i=i x'i to be a trait undergoing exponential selection. The first two cumulants 
are the mean and variance of the trait distribution while higher cumulants describe deviations from 
a Gaussian distribution. Let Km be the mth cumulant of the population. Each cumulant can be 
generated by differentiating the appropriate generating function ( [Stuart and Ord, 1987D , 

= hm -— G(7) where ^(7) = log E e"^" • (5) 

If we assume a sufficiently large population at linkage equilibrium then Eq. @ can be written in 
terms of allele frequencies by replacing the sum in Eq. by an average over the allele distribution 
defined in Eq. (|T[), 

G'(7) = logE0(a;)e^S»^» 

X 

L 

= Elog(p,(e^-l) + l) . (6) 

i=l 

In terms of allele frequencies the first few cumulants are, 

L 

i=l 
L 

i=l 

Ks = X:Pi(l-P.)(l-2p.). (7) 

i=l 

The first cumulant is the number of advantageous alleles within the population and the second 
cumulant measures the heterozygosity. From these equations we see that the expectation value of 
the trait cumulants is linearly related to the allele frequency moments. 

Notice that each cumulant can be written as a sum over contributions from each locus. This 
is not generally true for other combinations of the trait distribution's moments, such as the cen- 
tral moments. Because each locus is effectively subject to an independent diffusion process the 
central limit theorem ensures that fluctuations in each cumulant over an ensemble of populations 
undergoing an independent diffusion process will decrease with increased L. In the large L limit 
the cumulants will evolve deterministically and will be described by the deterministic equations 
of motion derived below (recall, however, that we require = o(L^) for our approximation to 



be valid, which limits the size of L). For finite L our equations of motion will only give the mean 
dynamical trajectory over an ensemble of populations. Figure |1] shows how fluctuations in the 
cumulant dynamics are reduced as L is increased. In section pi.D| we will show how the scale of 
these fluctuations can be estimated from the expectation value of the cumulants. 



D. Cumulant equations of motion 



To calculate how the generating function for the cumulants changes over time we follow the 
discussion given by pwens (1979| , p 136). First we write, 



Gil) = 9j{Pi) where g^{pi) = log (pi(e^ - 1) + 1) 



(8) 



i=l 



Taylor expanding in orders of Ap, and taking the expectation it is straightforward to show that. 



^t+5t[9^{Pi)] ^ Et[gj{pi)] +Et[a{pi)g' Jpi) + -b{p,)g''{p,)]6t 



(9) 



where a(pi) and b{pi) are defined in Eq. (|3|). In the limit 5t ^ we therefore obtain the rate of 
change of the generating function for the expectation value of the cumulants. 



dt 



i=l 



(10) 



The rate of change in the expectation value for each cumulant is given by differentiating Eq. 



d 9" 
d^Ea^n] = lhn^E, 



(11) 



In order to get equations of motion for the cumulants we first calculate the right hand side of 
Eq. ( [TT| ) in terms of allele frequencies. We then use the relationship between the allele frequencies 
and cumulants in Eq. to obtain equations involving only cumulants. We automated this pro- 



cess using the symbolic programming language Mathematica ( [Wolfram, 19990 . However, since 
submitting this article Priigel-Bennett has obtained a closed form expression for the dynamical 
equations which is given in the authored appendix 0. 

We obtain a linear first order ODE for the expected cumulants kn{t) = Et[L^^Kn] (we scale 
the cumulants by a factor of L^^ for convenience). 



^ = -Mk + d 

dt 



(12) 
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where d = 13,(3, . . -Y and, 



M 





2/3 




-a 

1 + 4/3 -a 
3(1 + 2/5) 






—a 



;i + 8/5) 







2(3 + 4/5) 



V 



(13) 



Unfortunately, the equation for each cumulant involves a cumulant of higher order, so that the 
system is intrinsically infinite dimensional. In the next section we describe some approximate 
solutions to this system of equations. 

The cumulant equations derived above can easily be related to a set of equations describing 
the moments of the allele frequency distribution, since the cumulants are linearly related to these 
moments by Eq. ([7|). However, the results described in the next section will often rely on special 
features displayed by the cumulants which are not shared by the allele frequency moments. In 
particular, the truncated system obtained by setting moments above some order to zero is very 



poorly behaved in contrast to the truncated cumulant system used in sections QI.A and QI.C below. 



III. SOLVING THE DYNAMICS 



The solution to Eq. ( [T2| ) can be written in terms of a diagonal matrix of eigenvalues D = [\i5ij] 
and the associated matrix V whose columns are eigenvectors of M (defined hy D = V~^MV), 



k{t) = k* + Ve-^'V-\k{0) - k*) . (14) 

Here, k* = M^^d is the fixed point. Properties of the fixed point are described in appendix 0. 

We have not been able to find a general result for the eigenvalues and eigenvectors of the full 
infinite dimensional system. In practice we find that a truncated system describes the dynamics 
well for a large range of parameters, as we will show in section |III.A| . In section |III.B| we use a 



perturbation expansion to solve the infinite dimensional system for small a. This expansion agrees 



with the zero mutation rate result due to ( [Kimura, 1955D but only provides a good approximation 



for a ~ 3 or less. In section pi.C| we give an approximation valid for low mutation rates which 
describes the system accurately for larger values of a. This approximation is based on the ob- 
servation that there is a separation of time- scales between the slowest mode and the rest as the 
mutation rate is reduced. This slow mode is strongly coupled to the first cumulant and determines 
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its expected rate of change. The higher cumulants quickly relax to a quasi-equilibrium which can 
be determined analytically for a truncated system. The second cumulant (trait variance) of this 
quasi-equilibrium distribution determines the rate of change in the first cumulant, and hence the 
eigenvalue of the slow mode. The separation of time-scales persists for relatively large selection 
intensities and comparisons with simulations suggest that the resulting approximation works well 
up to at least a ~ 10. 

A. Truncated system 

A finite system of equations can be obtained by truncating the infinite system in Eq. (|T2|). 
We create an ri-dimensional truncated system by only considering cumulants of order n and less. 
Setting Kn+i to zero decouples the equations from higher cumulants. The solution is given by 
Eq. ( [T4l ) with the infinite-dimensional matrix M replaced by the n-dimensional square submatrix 
starting in its top left-hand corner. The resulting finite system provides a very good approximation 
for the dynamics of the lower order cumulants as long as a is not too large. An increased order of 
truncation is required to achieve accurate results as a is increased. This is demonstrated in Fig. (|^) 
where we compare the fixed point of the cumulant equations for the truncated system to the known 
exact result derived from Wright's distribution (see appendix^. As the truncation order increases 
the approximation rapidly converges toward the correct result. However, for larger a the speed 
of convergence is slower. It is interesting to note that the results do not seem to depend strongly 
on the mutation rate. This suggests that the truncated system will work well for a large range of 
mutation rates but may break down for large a. In particular, we have limited our simulations to 
a < 10 although a higher order truncation might allow accurate results for stronger selection. 

In Fig. (Q) we compare the solution of the eighth order truncated system to averaged simulation 
results. The symbols show the first two cumulants for a = 1, 5 and 10 and solid lines show the 
theory. We observe excellent agreement between the theory and simulations. A similar truncated 
system describing the dynamics of the allele frequency moments does not reproduce the dynamics 
or fixed point well unless selection is very weak (a <^ 1) or mutation rates are high. 
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B. Perturbation theory 



Perturbation theory allows the solution to an intractable linear system by expansion around a 
solvable limiting case (see, for example, Pender and Orszag, 1978D . For a = the Jacobian matrix 



M is lower triangular and the eigenvalues and eigenvectors can be determined exactly even for the 
full infinite dimensional system. Indeed, the diffusion theory for « = has an exact closed form 
solution dKimura, 1964]) and one can verify that the known eigenvalues agree with our zeroth order 



result. We expand around the zeroth order result to obtain a perturbation expansion of arbitrary 
order, as described in appendix 0. 

The 4th order expansion is given for two eigenvalues by Eqs. (|B12p and (|B13|). For small P 



these are the smallest eigenvalues and we observe that Ai oc /3 for all orders of expansion. For 
P = the first eigenvalue is therefore zero and A2 determines the rate of decay under selection and 
drift. The expansion for A2 in this limiting case agrees with the expansion found by [Kimura (1955| ) 
from an eigenvalue expansion of the diffusion equation. To our knowledge no similar result has 
previously been obtained for the case of mutation, selection and drift studied here. 

In Fig. we compare the 2nd and 6th order perturbation expansion result to eigenvalues of a 
truncated system. The truncation order was increased well beyond the point where the eigenvalues 
were observed to converge onto a limiting value. These figures suggest that the perturbation theory 
provides a good approximation up until about a = 3, which agrees with Kimura's observation in 
the zero mutation case (JKimura, 195 51). 



C. Separation of time-scales and adiabatic elimination 

We will often be interested in the limiting case where the mutation rate is small. In this limit 
we find there is a separation of time-scales between the slowest mode, which is 0(/3), and the rest, 
which are 0(1). This picture was confirmed for small a by the perturbation theory described in 
section pi.B| and appendix 0. Figure ^ suggests that this behaviour also holds for larger selection 



intensities. In practice the dynamics will quickly relax to the slow mode, which will determine 
most of the system's long term dynamics. Figure ^ shows the averaged velocity of a population as 
it approaches the fixed point. The simulation results approach an asymptotic value (derived below) 
as P is decreased. Notice that the velocity is proportional to /3 and that the slow mode becomes 
increasingly dominant as /3 is reduced. Similar simulations were used by [Priigel-Bennett (19^^ 



10 



in order to compare the dynamics of populations with and without crossover, although the scaling 
relationship with the mutation rate was not considered explicitly in that study. 

Once the population reaches the steady state it moves as a travelling wave, as has previously 
been observed by [Burger (1993| ), in marked contrast to the infinite population dynamics. Burger 
argues that the velocity of the population is proportional to the population's variance, which can be 
determined under the assumption of a balance between neutral mutation and drift. In the present 
case mutation is not neutral, and does change the population mean, so that Burger's approximation 
is not strictly applicable. Also, we find that the population's variance depends on the selection 
intensity and cannot be calculated under a neutral model. However, the picture of a travelling wave 
is still applicable and for our model we can derive an analytical expression for the population's 
velocity in the steady state in a similar spirit to Burger's approximation. 

The asymptotic result shown in Fig. ^ is obtained by assuming an adiabatic elimination of fast 
variables for small (3 (for a discussion on adiabatic elimination see, for example, [Gardiner, 1985| ). 
We make the assumption that most of the first cumulant dynamics is determined by the slow mode. 
This is borne out by examination of the eigenvectors derived from perturbation theory, which show 
that the fast modes only couple to the first cumulant through terms which are 0(/3) (this is demon- 
strated in appendix 0). The higher cumulants have significant contributions from the fast modes. 
We can therefore think of the higher cumulants as fast variables which will rapidly converge to 
a quasi-equilibrium value which then changes slowly according to the slow mode. The coupling 
of the higher cumulants to the slow mode is through the first cumulant, so the quasi-equilibrium 
value is determined by solving the fixed point equations for the second and higher order cumulants 
for a particular value of the first cumulant. To calculate this fixed point we have to truncate the 



dynamical equations, as described in section pi.A| , and the quasi-equilibrium cumulants are found 
by solving a linear equation, 

n 

J2 MijkJ = P- Mah for 2 < i < n , (15) 

where n is the truncation order. We then use the quasi-equilibrium variance k2 to determine the 
rate of change of the first cumulant according to Eq. (|T^, 

-^ = ak'^ + (3{l-2h). (16) 

For an eighth-order cumulant truncation we find the following expression for the quasi-equilibrium 
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variance to first order in (3, 

^ ( 126a(2A:,-l)(4200 + 220a^ + a^) \ 
^ ^ V 1587600 + 189000^2 + 2898^4 + J ' ^ ^ 

Higher order truncations result in similar rational expressions with higher order polynomials in a 
appearing in the numerator and denominator. The above expression was used to plot the theoretical 
lines in Fig. @. The adiabatic elimination result is self-consistent since Eq. ( [T^ demonstrates 
that only the 0{(3) leading eigenvalue contributes to the decay of the first cumulant, as we initially 
assumed. 

For weak selection, perturbation theory confirms the adiabatic elimination result. The term 
proportional to ki on the right hand side of Eq. (|16|) gives the leading eigenvalue and we can 
expand in a. 



Comparison with Eq. ( |B12|) to first order in P shows that the results agree. However, the simu- 
lations in Fig. ^ indicate that the adiabatic elimination result provides a good approximation for 
much larger selection intensities than covered by the perturbation theory. 

D, Fluctuations 

We have mainly considered the averaged cumulant dynamics. However, the theory described 
here can also be used to estimate fluctuations from expected behaviour if we assume the same 
initial conditions at all loci. In this case there is an identical and independent diffusion process 
at all loci and we can consider the expected cumulants to be ensemble averages over loci and/or 
populations. For example, the variance of the mean is then given by. 



1 ^ 



^ i=l 



YariL'^Ki) = E 

= L-'[h{l - h) ~ h] . (19) 



1 ^ 



^ i=l 



where A;„ = Et[L^^Kn]. This shows how the fluctuations fall off with increasing L, as previously 
observed in Fig. |T]. We can similarly calculate the ensemble variance for the higher cumulants. 
It should be noted that the cumulants here describe an infinite population from which our finite 
population can be considered a sample. This sampling procedure will also introduce fluctuations 
in the measured cumulants, as quantified by Priigel-Bennett (1997[ ). 
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IV. CONCLUSION 



We studied the cumulant dynamics for a classical population genetics model, a finite popula- 
tion with two alleles at each locus evolving under multiplicative selection and reversible mutation. 
We identified the number of advantageous alleles as a trait undergoing exponential selection and 
the cumulants of the trait distribution were shown to be linearly related to moments of the allele 
frequency distribution. The dynamical equations were derived and formed a coupled infinite- 
dimensional linear system. A truncated system was shown to provide an excellent approximation 
to the full infinite-dimensional system for a broad range of parameters. For weak selection we de- 
veloped an analytical approximation to the eigensystem of the full infinite-dimensional case using 
a perturbation expansion around the solvable limit of zero selection. To all orders in perturbation 
theory a single eigenvalue was shown to be proportional to the mutation rate, resulting in a separa- 
tion of time-scales between different modes for low mutation rates. This separation of time-scales 
was observed to persist for stronger selection and allowed us to develop an analytical approxima- 
tion to the leading eigenvalue in the limit of weak mutation. This approximation was shown to 
provide good results for larger selection intensities than covered by the perturbation expansion. 

Cumulants have been shown to be useful for modelling a variety of polygenic systems (see, for 
example, Piirger, 2000D and here we have shown their potential for describing a finite population 
where genetic drift is non-negligible. Cumulants have a number of features which make them 
useful for modelling finite populations close to linkage equilibrium. Firstly, it was noted that each 
cumulants at linkage equilibrium can be written as a sum over contributions from each locus. This 
is not generally the case for other combinations of the trait moments, such as the central moments, 
and suggests that populations may be better described using cumulants rather than central moments 
since fluctuations will be reduced according to the central limit theorem. For the multiplicative 
selection case considered here the equations of motion were also linear, in which case fluctuations 
do not lead to strong systematic effects. This picture should be contrasted with cumulant equations 
describing finite asexual populations under multiplicative selection which are intrinsically far from 
linkage equilibrium, in which case fluctuations do contribute systematic effects and must be mod- 
elled explicitly dPriigel-Bennett, 1997| ). Secondly, although the trait cumulants are linearly related 
to the allele frequency moments, the latter do not share certain advantageous features displayed 
by the former. This is because the truncated equations of motion for the allele frequency moments 
do not describe the infinite-dimensional system well except under weak selection or high mutation 
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rates. 

Whether cumulant dynamics will prove generally useful for describing finite populations with 
significant linkage disequilibrium is an open question. Infinite populations with linkage disequilib- 
rium have recently been modelled using factorial cumulants ( pawson, 1999| ) but as we mentioned 
above, drift will introduce systematic effects which are typically difficult to model. Although ap- 
proximations have been introduced which attempt to capture these fluctuations (Priigel-Bennett, 

I I 

1997), the complexity and non-linearity of the resulting dynamical system make it difficult to draw 

general conclusions. 
Acknowledgements 

We would like to thank Adam Priigel-Bennett and Nick Barton for useful comments on a pre- 
liminary version of this paper. 

References 

C. M. Bender and S. A. Orszag. Advanced mathematical methods for scientists and engineers. McGraw- 
Hill, New York, 1978. 

M. G. Bulmer. The Mathematical Theory of Quantitative Genetics. Clarendon Press, Oxford, 1980. 
R. Burger. Moments, cumulants and polygenic dynamics. Journal of Mathematical Biology, 30:199-213, 
1991. 

R. Burger. Predictions of the dynamics of a polygenic character under directional selection. Journal of 

Theoretical Biology, 162:487-513, 1993. 
R. Burger. The Mathematical Thoery of Selection, Recombination and Mutation. Wiley, Chichester, 2000. 
J. F. Crow and K. M. Kimura. An Introduction to Population Genetics Theory. Harper & Row, New York, 

1970. 

K. J. Dawson. The dynamics of infinitesimally rare alleles, applied to the evolution of mutation rates and 
the expression of deleterious mutations. Theoretical Population Biology, 55:1-22, 1999. 

W. J. Ewens. Mathematical Population Genetics. Springer- Verlag, Berlin, 1979. 

C. W. Gardiner. Handbook of Stochastic Methods. Springer- Verlag, New York, 1985. 

P. G. Higgs and G. Woodcock. The accumulation of mutations in asexual populations and the structure of 
genealogical trees in the presence of selection. Journal of Mathematical Biology, 33:677-702, 1995. 

14 



M. Kimura. Stochastic processes and distribution of gene frequencies under natural selection. Cold Spring 

Harbor Symposia on Quantitative Biology, 20:33-53, 1955. 
M. Kimura. Diffusion models in population genetics. Journal of Applied Probability, 1:177-232, 1964. 
A. Priigel-Bennett. Modelling evolving populations. Journal of Theoretical Biology, 185:81-95, 1997. 
A. Priigel-Bennett. On the limit of long strings. In W Banzhaf and C Reeves, editors. Foundations of 

Genetic Algorithms 5, pages 45-56. Morgan Kaufmann, San Francisco, 1999. 
A. Priigel-Bennett and J. L. Shapiro. An analysis of genetic algorithms using statistical mechanics. Physical 

Review Letters, 72(9): 1305-1309, 1994. 
M. Rattray and J. L. Shapiro. The dynamics of a genetic algorithm for a simple learning problem. Journal 

of Physics, A:745 1-7473, 1996. 
A.Robertson. Inbreeding in artificial selection programmes. Genetical Research, 2:189-194, 1961. 
A. Stuart and J. K. Ord. Kendall's Advanced Theory of Statistics, Vol 1. Distribution Theory. Oxford 

University Press, New York, 5th edition, 1987. 
M. Turelli and N. H. Barton. Genetic and statistical analysis of strong selection on polygenic traits: What, 

me normal? Genetics, 138:913-941, 1994. 
S. Wolfram. The Mathematica Book. Cambridge University Press, Cambridge, 4th edition, 1999. 
S. Wright. The distribution of gene frequencies in populations. Genetics, 23:307-320, 1937. 



APPENDIX A: Fixed Point 

The fixed point of the allele frequency distribution for the model considered here was first 
determined by [Wright (1937D . [Kimura (1955D later showed this to also be the solution of the 
related diffusion equation with density given by, 

mPl,P2,...,PL})=C-'^f[e"'^^[p^{l-p^)f^-' , (Al) 

i=l 

where, 

and F(; , ; , ; ) is the Kummer confluent Hypergeometric function. The moments of this distribution 
can be shown to be related through the following recursion formula, 

2aE(p^) = {2-n + 2a- 4/?)E(p^-i) - (2 - n - 2(3)E{p'l~^) . (A2) 
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We can use this to determine relationships between the expected population cumulants using the 
definition in Eq. @. For the first few we find, 

ak2 = I3{2ki - 1) , 

ak^ = {1 + a + Af3)k2 - 2l3ki , 

aki = 3(1 - a + 2P)k3 + (3 + 4a + 12/?)A;2 - 6pki . 



As a check one can easily verify that these relationships satisfy Eq. ( [T2| ) with dk/dt = 0. In an 
authored appendix to this paper, Priigel-Bennett shows that the expressions are exactly equivalent 
(appendix 0). 



APPENDIX B: Perturbation theory 

Consider the linear system in Eq. (|12|). To determine the properties of the dynamics, we should 
study the eigenvalues and eigenvectors of M. This we do below by expanding around the solution 
for a = 0. 



1. Notation 

Denote the ith eigenvalue of M as its associated eigenvector is e(i). Denote the associ- 
ated eigenvector of the transpose of M as w(i). In Eq. ([T^, V is a matrix whose ith column is 
e{i) and is a matrix whose ith row is w(iy. 

We will expand in powers of a and use superscripts to denote the order. So, M is written as 

M = M°-aA, (Bl) 

where Ajj = 5ij+i. We likewise expand the eigenvalues and eigenvectors, 

e{i) = e°(i) + ae\i) + a^e^{i) + ... , (B2) 

X(i) = A°(i) + aX\i) + a^X\i) + ... , (B3) 

V = E'^ + aE^ + a'^E'^ + . . . , (B4) 

y-i = w° + aW^ + a^W^ + . . . , (B5) 

D = D° + aD^ + a^D^ + ... . (B6) 

The expansion is simple due to two properties of 
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1. iW" is lower triangular, and 

2. M° alternates with O's. i.e. M^. ^ ^ M^-^-^ = M^^y = 0. 



2. Zeroth Order 



Since is lower triangular, the eigenvalues are simply the diagonal elements, 

\%k) = M,, = ti^^ + 2kp . (B7) 
The eigenvectors can be found recursively via back-substitution. 



0; if J < k 

1; ifj = A; . (B8) 

^m=k xo{k)-xo{j)' omerwise 



The eigenvectors of the transposed system can also be found. The result is 



W%k)j 



2^m=j+l xO(k)-\0{j)-> J ^ 

1; if J = . (B9) 

0; if j > k 



The latter can be found either by truncating the transposed system at some order n, computing the 
first n eigenvectors by back-substitution starting from the lower right, and then taking n ^ oo, or 
by ansatz. We will argue momentarily that these are the only eigenvectors of the infinite system. 

It is clear that the right eigenvectors form a complete set and are defined without requiring 
any finite truncation^ The left eigenvectors can be found only by truncating the system to some 
order or by assuming that w^{k)j = for all j > k. Thus, one might worry that there are other 
possible left eigenvalues. However, since all of the eigenvalues are distinct, the right and left 
eigenvectors must form a mutually orthogonal set. Those given in Eqs. ( p^ ) and ( p9| ) do, and so 
the left eigenvectors must form a complete set as well. 

The result is that and have the same structure as M^; lower triangular with alternating 
zero elements. 



' They are not, however, normalizable in infinite dimensions 
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3. Arbitrary Order 

The nth order approximation to the eigenvalues and eigenvectors can be found in terms of the 
lower order approximations. The recursion is 

n-l 

M'^E'' - E'^D^ - D'^E^ = AE""-^ + ^ E""'^ . (BIO) 

i=2 

This must be solved for the nth order contribution to the eigenvectors, E^, and to the eigenvalues, 
This is solvable because E^ is lower triangular, so the upper triangular part of E" can be 
computed without knowledge of D^. Then, since — is lower triangular with zeros on the 
diagonal, depends only on the upper triangular part of E^. Once this is computed, the lower 
triangular part of can be found. 

This equation does not determine the diagonal part of E"'. This is arbitrarily taken to be 
zero. In other words, V^j = 1 to all orders which is like a choice of normalization for these 
non-normalizable vectors. 

To find the left eigenvectors, we ensure that W is the inverse of E to nth order in a. This 
requires that 

W = - W^E^-^ W"" . (Bll) 

We find that each order introduces one higher upper diagonal element to E and W . Thus, to 
compute A (A;) to order n, one needs to truncate the system at an order higher than A; + n. 



4. Results 



The expansion for the first two eigenvalues is given below, 

^^'^ = + " 3 + 10/3 + 8/?^ - " (1 + 2/?)3(3 + 4/?)3(5 + 4/5) + ^ ' ^^'^^ 

4 1 - 278/5 + 492/32 + 1392/33 - 96/3^ - 2112/3^ - 640/3^ + 512/3^ g 
~ 8(1 + /3)3(1 + 2/3)3(5 + 4/3)3(7 + 4/3) + " 

As seen in Fig. ^ the series expansion to 6th order in a is accurate for a up to about 3. 
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APPENDIX C: Adiabatic elimination 



The dynamical equation can be written as 

V-^^ = -DV-^k + V-'d. (CI) 
dt 

To all orders in perturbation theory and to leading order in /3, this equation takes the following 
form, 

h + afuh + a'^fish + ■ ■ ■ = e"^^^)* (ki + afi2k2 + a'^fisks + •.•)+ constant, 
(5af2iki + ^2 + af2'ih + • • • = e"'^^^)* {P(yf2iki + /c2 + Q;/23fc3 + ...)+ constant, 
Pfsih + a/324 + 4 + ■ ■ ■ = e-^(2)* Wfsih + af32k2 + ak3 + ...)+ constant, 

; ; (C2) 

where kn is the time-derivative of kn and the fi/s are trivially related to the eigenvectors but have 
the leading order in (3 pulled out. The structure of the eigenvectors and eigenvalues reveals three 
points, 

• For all equations beyond the first one, coupling to ki and ki is through terms which are 
0{P). 

• The first eigenvalue is 0{f3); the remaining ones are 0(1). 

• Truncating the equations to have m components results in an error of 0(q;"*~*+^) to the ith 
equation. 

This shows why adiabatic elimination works. For p — 0, {k2, ks ) satisfies a closed set of 

equations completely decoupled from ki. Once the higher cumulants have decayed, ki becomes 
frozen. For /3 small, the coupling is present but weak and the higher cumulants will behave as a 
quasi-closed set decaying at a rate of 0(1). The first cumulant will decay very slowly after the 
higher ones have decayed and will depend on them through their quasi-equilibrium values. As 
long as the selection intensity is not too large, truncating the system will result in small errors. 

APPENDIX D: Explicit Form of the Cumulant Evolution Equations 

Adam Priigel-Bennett 

ECS, University of Southampton, S017 IB J, United Kingdom 
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In this appendix we give an explicit form for the evolution equation (12). We also obtain 
an analogous equation for the moments. We can rederive the diffusion equation for the allele 
frequency distribution (equation (4)) directly from the matrix equation for the moments. This 
illustrates clearly the connection between a matrix equation formulation and a diffusion equation 
formulation. 

This work was completed after receiving a preprint of the above paper by Rattray and Shapiro. 
It seems appropriate to publish these results at the same time as the paper. 

1. Explicit Evolution Equation 

Equation (12) can be written explicitly in terms of binomial coefficients 

^ = /3 + aK^, - X: (2/? (^^ 1) + (^%)) + ^^^n] (Dl) 
where we denote the indicator function by 

{1 if predicate is true 
if predicate is false 

We can write an equivalent set of equations for the allele frequency moments /i„ = Et [p"] , 

^ = -{k + 1) [aii,^, + (^p-a + ^ fik+i - (/^ + ^) ■ (D2) 

This equation still holds for A; = provided we interpret hq = 1. We can obtain the diffusion 
equation (4) directly from this equation. 



SIcetch of Proof 



The proof of equation ( |DT| ) follows from two observation. Unfortunately, it involves rather 
laboured algebraic manipulations. We therefore give only an outline of the proof below. 

The first observation (already discussed in section II) is that the cumulant equations all decou- 
ple over the loci. We can therefore consider the cumulants for a single locus. We can write an 
analogous equation to equation (11) for the single-locus cumulants (we drop the loci subscript i as 
it plays no part in what follows) 



dkn 

dt 



Et 



a{p) 



dK^\p) , bip)d'K:!ip) 



dp 



Qp2 



(D3) 
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with a{p) = ap{l — p) + P{1 — 2p), h{p) = p{l— p). 



and where we have used 



^7" 



^^(p)=log(p(e^-l) + i: 



(D4) 



fc„ = Et[ir:'(p)] . (D5) 

The kn used here refers to a single-locus cumulant; it differs from kn used in the main paper, which 
is the single-locus cumulant averaged over all loci. Only when the initial conditions at all loci are 
the same will these quantities coincide. 

The second observation is a consequence of equations ( p4D and (p5[), namely that the cumu- 
lants kn are linearly related to the allele frequency moments /i„ = Et[p"]. We can write the 
cumulants as 



k„ = Et 



z>o 



Et 



p 



E 



(D6) 



where 



are the coefficients of the Taylor expansion of Et = g^yip). They are zero for 

I > n. We can write the moments in terms of cumulants 

/ 



E 

n>0 



n 



(D7) 



The rest of the calculation involves expanding K^{p) as a polynomial in p and performing the 
average to obtain equations in terms of the moments. We then expand the moments in terms of the 
cumulants kn- This gives us equations in terms of the coefficients " and |^|. We can then use 
a series of identities for the coefficient to write the solution in the form of equation (pT|). 



Identities for Coefficients 



There a six identities among the coefficients that we use extensively. The first two identities are 



recursion relations that can be derived from the generating function given in equation ( p4D 



n — 1 
I 



(/-I) 



(D8) 



Using the boundary conditions 



/ - 1 

n 



(/-I 

and 



n — 1 



(D9) 



I I = 1 we can generate all the 



coefficients. The next two identities express the fact that the two sets of coefficients allow us to 
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go from cumulants to moments and back to cumulants or from moments to cumulants and back to 
moments 



E 

/>0 



E 

n>0 



l\n 



m\ 





'n' 


it) 


.1. 



\[k = I] 



(DIO) 



(Dll) 



The final identities relate the coefficients | | and 



E 

l>0 



n 
l + l 



E^ 

i>0 



/ n - 



to binomial coefficients 

-\\n 



m\ 



\n+m 



n 

m — 



+l[n = m — 1] 



(D12) 



(D13) 



These two identities can be proved by induction using the recursion relations ( pS| ) and ( p9| ) to- 
gether with the well known recursion relation for binomial coefficients. The coefficients used here 
are analogous to Stirling numbers in that they obey a second order recursion relation, although the 
recursion relations are different. Similar identities hold for Stirling numbers; these are developed 
in the book Concrete Mathematics (Graham, Knuth and Patashnik, 2nd Ed. 1995). 



Derivation of Equation (UTi 



We show explicitly how to obtain the terms in equation ( pTl ) proportional to a and /3, the 
remaining (diffusion) term follows similarly, although it requires a few more lines a algebraic 
manipulation. We start from equation ( P3D . The term proportional to a is given by 



Et 



ap (1 — p)- 



dp 



l>0 



Ip 



l-l 



«E 

l>0 



«EE 

m>0 l>0 



m 



l + l 



m 



"EE(^ 

m>0 Z>0 



n 
l-l 



m 
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m>0 l>0 



n + 1 

I 



I 



m 



where we used equations (p6[), (p7D, ( p8D and ( plOD respectively in the derivation. The term 
proportional to /3 is give by 



Et 



f3{l - 2p) 



dp 



Et 



«>0 



l{p^-^-2p^) 



P 



+ /9E (^ + 1) 

«>0 ^ 



n 
l + l 
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E, 



i>0 



n 
l+l 



/9 + /5EE ((^ + 1) 

«>0 m>0 



?2 + 1 
/ + 1 



m 



m>0 



n 



m — 1 



\n+m 



n 



m — 1 



m>0 



n 

m — 1 



l\n + m, even] k„ 



where we used equations (P4D, (psp, (p7|), ( pi2| ) and ( pi3| ) respectively. The last term follows 
similarly. Putting these together we obtain equation (pTl). 



2. Dynamics of the Moments 



Having developed the machinery to go from cumulants to moments we can now easily obtain 
a recursion relation for the moments. Although cumulant expansions typically have much better 
convergence properties than moment expansions (and are therefore more useful in obtaining ap- 
proximate solutions), for this problem the dynamics for the moments turns out to have a simpler 
form than that for the cumulants. This allows us to rederive the diffusion equation for the allele 
frequencies. 

Using equation ( P^D we can write equation ( p3[ ) in terms of moments 



E 

z>o 



df ~ ^ 



I a{fii - /i;+i) + P{fii-i - 2fii) + 



(/-I) 



(D14) 
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We can obtain a differential equation for the moments by multiplying both sides of equation ( P14| ) 
by I I and summing over n. Using the identity ( pi 1[ ) we find 

-^-^ = (k + l) - Hk+2) + f3{Hk - 2/Xyfc+i) + ^(/ifc - Aifc+i) j • (D15) 

Rearranging gives equation ( p2p . 

Equation ( p2[ ) is a linear transformation of the differential equation (|QT|) for the cumulants. 
To see this more directly we can consider the coefficients f// „ = | as elements of a lower- 
triangular matrix U. The coefficients " are the elements of the matrix U^^. We can write 



equation ( p2D as a matrix equation analogous to equation (12) 



where M = UMIJ-^ and d = Ud with 4 = /5I[A; = 1]. The matrix M is tridiagonal. 



Obtaining the Diffusion Equation 

Using equation (p2D, we can obtain the diffusion equation for allele frequencies. We first obtain 
a partial differential equation for the characteristic function 

\k 



To do this, we multiplying equation ( p2D by (^z)^^^ lk\ and sum over giving 

Taking the inverse Fourier transform we obtain the diffusion equation for the single-locus allele 
frequencies 



dt dyy 2 d'p \ \ 2, 

This can be re- written in the form of equation (4). Setting the right-hand side of this equation to 
zero we easily obtain the fixed point solution 00), which was first found by Wright (1937). 
Since the diffusion equation was derived directly from the moment equations (p2|), the fixed points 
will have the same moments, and as equations ( p2| ) are just a linear transformation of cumulant 
equations (|DT|), the fixed point will have the same cumulants. 
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We can obtain eigenvalue equations from the diffusion equation by looking for solutions of the 
form t) — (t){p, oo) + p{t)'^{p). This reduces the problem to solving a eigenvalue differential 
equation, subject to boundary conditions. However, these equations are, in general, difficult to 
solve as they require solving a boundary value problem, often with complicated boundary condi- 
tions. 

The similarity between diffusion equation problem and quantum mechanics has often been 
observed. Here we note another connection. Moving between a matrix equation and diffusion 
equation parallels the connection between Heisenberg's matrix formulation of quantum mechanics 
and Schrodinger's differential equation. 
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FIG. 1 The first two cumulants are shown from a single realisation of the evolutionary dynamics for popu- 
lations with differing numbers of loci: L = 32 (left) and L = 3200 (right). Both populations evolve under 
multiphcative selection, reversible mutation and free recombination, with TV = 500, a = 5 and /? = 0.01 

(a = Ns and P = Nu). 

FIG. 2 Fixed point cumulant results from the truncated cumulant equations (A;* ) are compared to cumulants 
derived from Wright's distribution (kn) as the truncation order is increased. The top figures show the 
difference between the first and second cumulant results for P = I, using a logarithmic scale. The bottom 
figures show the first cumulant results for lower mutation rates, /? = 0.1 and (3 = 10~^. The symbols 
denote different selection intensities, with a = 1(+), 2(A), 5(o) and 10(x). 

FIG. 3 The averaged dynamics are shown for the first two cumulants with a = l(o), 5(A) and 10(x). 
The eight cumulant theory is shown by the sohd fines. The results were averaged over 500 iterations of the 
evolutionary dynamics with free recombination, N = 200 and (3 = 0.01. In each case the population was 
initialised with all alleles set to zero. 
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FIG. 4 Approximate results for the two lowest eigenvalues given by the perturbation theory are compared 
to results from the 16 cumulant truncated system (solid lines). The 2nd order (dot-dashed) and 6th order 
(dashed) perturbation theory results are shown for various a, for /3 = 0.1 (left) and (3 = 0.01 (right). 

FIG. 5 The average rate of change in the mean is shown as a function of the mean for a = 1 (left) and 
a = 10 (right). The symbols show results averaged over up to 1000 iterations of the evolutionary dynamics, 
with f3 = l(o), O.l(x) and 0.01(A). The solid line shows the eight cumulant adiabatic elimination result 
for small /? (see Eqs. (p^) and ([l7|)). Populations were either initialised with alleles all set to zero, giving a 
positive rate of change, or were initialised with alleles all set to one, giving a negative rate of change. The 
populations were then given enough time to converge close to the fixed point. For a = 1 we chose L = 960 
and N = 100 and for a = 10 we chose L = 96 and = 200. In each case the populations were subject to 
free recombination. 
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